This is a nice book for both young and old. It gives beautiful life lessons in a fun way. Definitely worth the money!
+ Educational
+ Funny
+ Price
Nice story for older children.
+ Funny
- Readability

Sentiment
Sentiment =
Feelings, Attitudes, Emotions, Opinions
A thought, view, or attitude, especially one based mainly on emotion instead of reason
Subjective impressions, not facts
Webster’s Dictionary
Scherer Typology of Affective States
- Emotion: brief organically synchronized … evaluation of a major event
- angry, sad, joyful, fearful, ashamed, proud, elated
- Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
- cheerful, gloomy, irritable, listless, depressed, buoyant
- Interpersonal stances: affective stance toward another person in a specific interaction
- friendly, flirtatious, distant, cold, warm, supportive, contemptuous
- Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
- liking, loving, hating, valuing, desiring
- Personality traits: stable personality dispositions and typical behavior tendencies
- nervous, anxious, reckless, morose, hostile, jealous
Sentiment Analysis
- Use of natural language processing (NLP) and computational techniques to automate the extraction or classification of sentiment from typically unstructured text
Opinion mining
Sentiment mining
Subjectivity analysis
Sentiment analysis can be applied in every topic & domain!
Book: is this review positive or negative?
Humanities: sentiment analysis for historic German plays.
Products: what do people think about the new iPhone?
Blog: how are people thinking about immigrants?
Politics: who is going to win the election?
Twitter: what is trending?
Movie: is this review positive or negative (IMDB, Netflix)?
Marketing: how is consumer confidence? Consumer attitudes?
Healthcare: are patients happy with the hospital environment?
Two main types of opinions
(Jindal and Liu 2006; Liu, 2010)
Regular opinions: Sentiment/opinion expressions on some target entities
Direct opinions:
- “The touch screen is really cool.”
Indirect opinions:
- “After taking the drug, my pain has gone.”
Comparative opinions: Comparison of more than one entity.
- E.g., “iPhone is better than Blackberry.”
Practical definition
(Hu and Liu 2004; Liu, 2010, 2012)
An opinion is a quintuple
( entity, aspect, sentiment, holder, time)
where entity: target entity (or object).
aspect: aspect (or feature) of the entity.
sentiment: +, -, or neu, a rating, or an emotion.
holder: opinion holder.
time: time when the opinion was expressed.
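The quintuple maps naturally onto a small data structure. A minimal sketch in Python; the field values below are one hand-labeled reading of the Kindle review that follows, not the output of any extraction system:

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    """An opinion as the (entity, aspect, sentiment, holder, time) quintuple."""
    entity: str
    aspect: str
    sentiment: str   # "+", "-", "neu", a rating, or an emotion
    holder: str
    time: str

# One hand-labeled reading of the Kindle review below (illustrative):
op = Opinion(entity="The Little Prince (Howard translation)",
             aspect="translation",
             sentiment="-",
             holder="Kindle Customer",
             time="2015-08-16")
print(op.sentiment)  # -
```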
Example
Kindle Customer Reviewed in the United States on August 16, 2015:
This has been my favorite book since I was 14 and had to read it in French as an assignment in school. I fell in love with it and immediately bought the English translation by Katherine Woods, as I knew I would read it many times over the years and I knew my French was not likely to improve. Today I bought this version to have on my Kindle as I was thinking of giving my 40 year old paperback to my best friend. I could not be more disappointed. The changes in this translation take so much away from the book that it almost changes who the Little Prince really is. The charm of the book is completely missing. In one of my favorite parts of the book the fox talks to the Little Prince, sharing his invaluable truth: “what is essential is invisible to the eye.” Howard changes it to “Anything essential is invisible to the eyes”, which changes the entire concept of what is said. “The eye” is every eye, everywhere. Making it plural takes away the meaning of what the fox is really saying. If you want to read this book, if you want to read it to your children, please take my advice and find the Katherine Woods translation, even if it means going to a used book store. I simply cannot understand what Howard was thinking in all of the changes he made to this wonderful story that will stay with you for a lifetime, but only if you read the Woods translation which will open your eyes to the true meaning of the Little Prince. As the fox says: “Words are the source of misunderstandings” and Howardh has changed the words so much that indeed, in this version, words are very much the source of misunderstandings.
Sentiment Analysis
Simplest task:
- Is the attitude of this text positive or negative?
More complex:
- Rank the attitude of this text from 1 to 5
Advanced:
- Detect the target, source, or complex opinion types
- Implicit opinions or aspects
Simple task: Opinion summary
Aspect: Touch screen
- Positive: 212
  - “The touch screen was really cool.”
  - “The touch screen was so easy to use and can do amazing things.”
  - …
- Negative: 6
  - “The screen is easily scratched.”
  - “I have a lot of difficulty in removing finger marks from the touch screen.”
  - …
Aspect: Size
- …
Problem
Which features to use?
- Words (unigrams)
- Phrases/n-grams
- Sentences
How to interpret features for sentiment detection?
- Bag of words (IR)
- Annotated lexicons (WordNet, SentiWordNet)
- Syntactic patterns
- Paragraph structure
Challenges
Harder than topic classification, for which bag-of-words features perform well
- Must consider other features due to…
  - Subtlety of sentiment expression
    - irony
    - expression of sentiment using neutral words
  - Domain/context dependence
    - words/phrases can mean different things in different contexts and domains
  - Effect of syntax on semantics
Approaches for Sentiment Analysis
- Lexicon-based methods (dictionary-based)
Using sentiment words and phrases: good, wonderful, awesome, troublesome, cost an arm and a leg
Not completely unsupervised!
- Supervised learning methods: to classify reviews into positive and negative.
- Machine learning
- Naïve Bayes, Maximum Entropy, Support Vector Machine
- Recent research
- Deep learning
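Before the individual lexicons, a sketch of how a lexicon-based method scores text: count matches against positive and negative word lists and compare. The tiny word sets here are toy placeholders, not a real resource such as the Bing Liu Opinion Lexicon:

```python
# Toy placeholder lexicons (a real system would load thousands of words)
POSITIVE = {"good", "wonderful", "awesome", "great", "cool", "love"}
NEGATIVE = {"troublesome", "bad", "poor", "awful", "hate", "disappointed"}

def lexicon_score(text):
    """Classify text by counting positive vs. negative lexicon hits."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(lexicon_score("The touch screen is really cool"))  # positive
```

Note this is “not completely unsupervised”: someone had to build the word lists.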
Lexicon-based methods
The General Inquirer
Home page: http://www.wjh.harvard.edu/~inquirer
List of Categories: http://www.wjh.harvard.edu/~inquirer/homecat.htm
- Spreadsheet: http://www.wjh.harvard.edu/~inquirer/inquirerbasic.xls
- Categories:
- Positiv (1915 words) and Negativ (2291 words)
- Strong vs. Weak, Active vs. Passive, Overstated vs. Understated
- Pleasure, Pain, Virtue, Vice, Motivation, Cognitive Orientation, etc
Free for Research Use
Philip J. Stone, Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. MIT Press.
LIWC (Linguistic Inquiry and Word Count)
- Home page: http://www.liwc.net/
2300 words, >70 classes
- Affective Processes
- negative emotion (bad, weird, hate, problem, tough)
- positive emotion (love, nice, sweet)
- Cognitive Processes
- Tentative (maybe, perhaps, guess), Inhibition (block, constraint)
Pronouns, Negation (no, never), Quantifiers (few, many)
Pennebaker, J.W., Booth, R.J., & Francis, M.E. (2007). Linguistic Inquiry and Word Count: LIWC 2007. Austin, TX
MPQA Subjectivity Cues Lexicon
- 6885 words from 8221 lemmas
- 2718 positive
- 4912 negative
Each word annotated for intensity (strong, weak)
GNU GPL
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.
Riloff and Wiebe (2003). Learning extraction patterns for subjective expressions. EMNLP-2003.
Bing Liu Opinion Lexicon
- Bing Liu’s Page on Opinion Mining
- 6786 words
- 2006 positive
- 4783 negative
Minqing Hu and Bing Liu. Mining and Summarizing Customer Reviews. ACM SIGKDD-2004.
SentiWordNet
Home page: http://sentiwordnet.isti.cnr.it/
All WordNet synsets automatically annotated for degrees of positivity, negativity, and neutrality/objectiveness
- [estimable(J,3)] “may be computed or estimated”
  - Pos 0, Neg 0, Obj 1
- [estimable(J,1)] “deserving of respect or high regard”
  - Pos .75, Neg 0, Obj .25
Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010. SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. LREC-2010.
Disagreements between polarity lexicons
Christopher Potts, Sentiment Tutorial, 2011
Analyzing the polarity of each word in IMDB
Potts, Christopher. 2011. On the negativity of negation. SALT 20, 636-659.
- How likely is each word to appear in each sentiment class?
- Count(“bad”) in 1-star, 2-star, 3-star, etc.
- But raw counts aren’t comparable across classes of different sizes:
- Instead, likelihood: \(P(w|c) = \frac{f(w,c)}{\sum_{w \in c}{f(w,c)}}\)
- Make them comparable between words
- Scaled likelihood: \(\frac{P(w|c)}{P(w)}\)
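A toy computation of the likelihood and scaled likelihood, with invented counts f(w, c) standing in for the IMDB ratings data; a scaled likelihood above 1 means the word is over-represented in that class:

```python
# Invented counts f(w, c) over star-rating classes 1..5 (toy stand-in for IMDB)
counts = {
    "bad":  {1: 90, 2: 50, 3: 20, 4: 8, 5: 2},
    "good": {1: 10, 2: 20, 3: 40, 4: 60, 5: 70},
}
class_totals = {c: sum(counts[w][c] for w in counts) for c in range(1, 6)}
grand_total = sum(class_totals.values())

def likelihood(w, c):
    """P(w|c) = f(w,c) / sum over words in c of f(w,c)."""
    return counts[w][c] / class_totals[c]

def scaled_likelihood(w, c):
    """P(w|c) / P(w): > 1 means w is over-represented in class c."""
    return likelihood(w, c) / (sum(counts[w].values()) / grand_total)

print(round(scaled_likelihood("bad", 1), 2))  # ~1.96: "bad" skews toward 1-star
```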
Other sentiment feature: Logical negation
Potts, Christopher. 2011. On the negativity of negation. SALT 20, 636-659.
Is logical negation (no, not) associated with negative sentiment?
- Potts experiment:
- Count negation (not, n’t, no, never) in online reviews
- Regress against the review rating
Potts 2011 Results:
More negation in negative sentiment
Semi-supervised learning of lexicons
- Use a small amount of information
- A few labeled examples
- A few hand-built patterns
- To bootstrap a lexicon
Hatzivassiloglou and McKeown intuition for identifying word polarity
Vasileios Hatzivassiloglou and Kathleen R. McKeown. 1997. Predicting the Semantic Orientation of Adjectives. ACL, 174–181
- Adjectives conjoined by “and” have same polarity
- Fair and legitimate, corrupt and brutal
- *fair and brutal, *corrupt and legitimate
- Adjectives conjoined by “but” do not
- fair but brutal
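The “and” intuition can be sketched as a crude pattern harvester. Hatzivassiloglou & McKeown parse the corpus and use POS tags; here a regex plus a hand-listed adjective set stands in for both:

```python
import re

# Hand-listed adjectives acting as a stand-in POS filter (toy assumption)
ADJECTIVES = {"fair", "legitimate", "corrupt", "brutal", "honest"}

def and_pairs(text):
    """Harvest 'ADJ and ADJ' pairs, which tend to share polarity."""
    pairs = []
    for a, b in re.findall(r"\b(\w+) and (\w+)\b", text.lower()):
        if a in ADJECTIVES and b in ADJECTIVES:
            pairs.append((a, b))
    return pairs

print(and_pairs("The deal was fair and legitimate, the regime corrupt and brutal."))
# [('fair', 'legitimate'), ('corrupt', 'brutal')]
```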
Hatzivassiloglou & McKeown 1997
Step 1
- Label seed set of 1336 adjectives
(all occurring >20 times in a 21-million-word WSJ corpus)
- 657 positive
- adequate central clever famous intelligent remarkable reputed sensitive slender thriving…
- 679 negative
- contagious drunken ignorant lanky listless primitive strident troublesome unresolved unsuspecting…
Hatzivassiloglou & McKeown 1997
Step 2
- Expand seed set to conjoined adjectives
Hatzivassiloglou & McKeown 1997
Step 3
- Supervised classifier assigns “polarity similarity” to each word pair, resulting in graph:
Hatzivassiloglou & McKeown 1997
Step 4
- Clustering for partitioning the graph into two
Output polarity lexicon
- Positive
- bold decisive disturbing generous good honest important large mature patient peaceful positive proud sound stimulating straightforward strange talented vigorous witty…
- Negative
- ambiguous cautious cynical evasive harmful hypocritical inefficient insecure irrational irresponsible minor outspoken pleasant reckless risky selfish tedious unsupported vulnerable wasteful…
Turney Algorithm
Peter D. Turney. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL-2002, 417–424.
- Extract a phrasal lexicon from reviews
- Learn polarity of each phrase
- Rate a review by the average polarity of its phrases
Extract two-word phrases with adjectives
How to measure polarity of a phrase?
Positive phrases co-occur more with “excellent”
Negative phrases co-occur more with “poor”
But how to measure co-occurrence?
Pointwise Mutual Information
- Mutual information between 2 random variables X and Y
\[I(X;Y) = \sum_{x} \sum_{y}{P(x,y)log_2{\frac{P(x,y)}{P(x)P(y)}}}\]
- Pointwise mutual information:
- How much more do events x and y co-occur than if they were independent?
\[PMI(x,y)=log_2{\frac{P(x,y)}{P(x)P(y)}}\]
Pointwise Mutual Information
- Pointwise mutual information:
- How much more do events x and y co-occur than if they were independent?
\[PMI(x,y)=log_2{\frac{P(x,y)}{P(x)P(y)}}\]
- PMI between two words:
- How much more do two words co-occur than if they were independent?
\[PMI(word_1,word_2)=log_2{\frac{P(word_1,word_2)}{P(word_1)P(word_2)}}\]
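A minimal helper for the pointwise formula, given probability estimates from any source; the 10%/5% figures below are invented for illustration:

```python
import math

def pmi(p_xy, p_x, p_y):
    """How much more x and y co-occur than if independent (in bits)."""
    return math.log2(p_xy / (p_x * p_y))

# Two words that each appear in 10% of documents and co-occur in 5% of them:
print(round(pmi(0.05, 0.10, 0.10), 2))  # 2.32 -> co-occur ~5x more than chance
```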
How to Estimate Pointwise Mutual Information
- Query a search engine (AltaVista)
  - P(word) estimated by hits(word)/N
  - P(word1, word2) estimated by hits(word1 NEAR word2)/N²
  - The factors of N cancel, leaving:
\[PMI(word_1,word_2)=log_2{\frac{hits(word_1 \: \mathrm{NEAR} \: word_2)}{hits(word_1)\,hits(word_2)}}\]
Does phrase appear more with “poor” or “excellent”?
\[ \begin{align} \mathrm{Polarity}(phrase) &= \mathrm{PMI}(phrase, \mathrm{"excellent"}) - \mathrm{PMI}(phrase, \mathrm{"poor"}) \\ &= log_2{\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"excellent"})}{hits(phrase)\,hits(\mathrm{"excellent"})}} - log_2{\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"poor"})}{hits(phrase)\,hits(\mathrm{"poor"})}} \\ &= log_2{\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"excellent"})}{hits(phrase)\,hits(\mathrm{"excellent"})}} \cdot {\frac{hits(phrase)\,hits(\mathrm{"poor"})}{hits(phrase \: \mathrm{NEAR} \: \mathrm{"poor"})}} \\ &= log_2{\left(\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"excellent"})\, hits(\mathrm{"poor"})}{hits(phrase \: \mathrm{NEAR} \: \mathrm{"poor"})\, hits(\mathrm{"excellent"})}\right)} \end{align} \]
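Since hits(phrase) cancels, the polarity score needs only four hit counts; a one-line sketch, with invented counts for illustration:

```python
import math

def polarity(hits_near_excellent, hits_near_poor, hits_excellent, hits_poor):
    """Turney's polarity: PMI with "excellent" minus PMI with "poor"."""
    # hits(phrase) cancels out of the difference, as in the derivation
    return math.log2((hits_near_excellent * hits_poor) /
                     (hits_near_poor * hits_excellent))

# Invented counts: phrase appears near "excellent" 4x as often as near "poor",
# while "excellent" and "poor" are equally frequent overall
print(polarity(200, 50, 10**6, 10**6))  # 2.0
```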
Phrases from a thumbs-up review
| Phrase | POS tags | Polarity |
|---|---|---|
| online service | JJ NN | 2.8 |
| online experience | JJ NN | 2.3 |
| direct deposit | JJ NN | 1.3 |
| local branch | JJ NN | 0.42 |
| … | | |
| low fees | JJ NNS | 0.33 |
| true service | JJ NN | -0.73 |
| other bank | JJ NN | -0.85 |
| inconveniently located | RB VBN | -1.5 |
| Average | | 0.32 |
Phrases from a thumbs-down review
| Phrase | POS tags | Polarity |
|---|---|---|
| direct deposits | JJ NNS | 5.8 |
| online web | JJ NN | 1.9 |
| very handy | RB JJ | 1.4 |
| … | | |
| virtual monopoly | JJ NN | -2 |
| lesser evil | RBR JJ | -2.3 |
| other problems | JJ NNS | -2.8 |
| low funds | JJ NNS | -6.8 |
| unethical practices | JJ NNS | -8.5 |
| Average | | -1.2 |
Results of Turney algorithm
- 410 reviews from Epinions
- 170 (41%) negative
- 240 (59%) positive
- Majority class baseline: 59%
- Turney algorithm: 74%
- Phrases rather than words
- Learns domain-specific information
Using WordNet to learn polarity
S.M. Kim and E. Hovy. 2004. Determining the sentiment of opinions. COLING 2004
M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of KDD, 2004
- WordNet: online thesaurus (covered in later lecture).
- Create positive (“good”) and negative seed-words (“terrible”)
- Find Synonyms and Antonyms
- Positive Set: Add synonyms of positive words (“well”) and antonyms of negative words
- Negative Set: Add synonyms of negative words (“awful”) and antonyms of positive words (”evil”)
- Repeat, following chains of synonyms
- Filter
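A sketch of this bootstrap with a tiny hand-made synonym/antonym graph standing in for WordNet; the entries are hypothetical, and the final filter is the crudest possible one:

```python
# Hypothetical synonym/antonym graph standing in for WordNet lookups
SYNONYMS = {"good": {"well", "fine"}, "well": {"good"},
            "terrible": {"awful", "dreadful"}, "awful": {"terrible"}}
ANTONYMS = {"good": {"evil", "bad"}, "terrible": {"great"}}

def expand(seed_pos, seed_neg, rounds=2):
    """Grow polarity sets by following synonym/antonym chains from seeds."""
    pos, neg = set(seed_pos), set(seed_neg)
    for _ in range(rounds):
        pos |= {s for w in pos for s in SYNONYMS.get(w, ())}  # syns of positives
        pos |= {a for w in neg for a in ANTONYMS.get(w, ())}  # ants of negatives
        neg |= {s for w in neg for s in SYNONYMS.get(w, ())}
        neg |= {a for w in pos for a in ANTONYMS.get(w, ())}
    return pos - neg, neg - pos  # crude filter: drop words that reached both sets

pos, neg = expand({"good"}, {"terrible"})
print(sorted(pos))  # ['fine', 'good', 'great', 'well']
```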
Summary on Learning Lexicons
- Advantages:
- Can be domain-specific
- Can be more robust (more words)
- Intuition
- Start with a seed set of words (‘good’, ‘poor’)
- Find other words that have similar polarity:
- Using “and” and “but”
- Using words that occur nearby in the same document
- Using WordNet synonyms and antonyms
Supervised methods
Document sentiment classification
- Classify a whole opinion document (e.g., a review) based on the overall sentiment of the opinion holder
(Pang et al 2002; Turney 2002)
- Classes: Positive, negative (possibly neutral)
- An example review:
- “I bought an iPhone a few days ago. It is such a nice phone, although a little large. The touch screen is cool. The voice quality is great too. I simply love it!”
- Classification: positive or negative?
- It is basically a text classification problem
Sentence sentiment analysis
- Classify the sentiment expressed in a sentence
- Classes: positive, negative, neutral
- Neutral means no sentiment expressed
- “I believe he went home yesterday.”
- “I bought an iPhone yesterday.”
- But bear in mind:
- Explicit opinion: “I like this car.”
- Fact-implied opinion: “I bought this car yesterday and it broke today.”
- Mixed opinion: “Apple is doing well in this poor economy”
Features for supervised learning
The problem has been studied by numerous researchers.
- Key: feature engineering. A large set of features has been tried by researchers, e.g.:
- Term frequency and different IR weighting schemes
- Part of speech (POS) tags
- Opinion words and phrases
- Negations
- Syntactic dependency
Approaches
- Machine learning
Naïve Bayes (assumes conditionally independent features)
Maximum Entropy classifier (makes no feature-independence assumption)
SVM
- Markov Blanket Classifier
- Accounts for conditional feature dependencies
- Allowed reduction of discriminating features from thousands of words to about 20 (movie review domain)
Sentiment Classification in Movie Reviews
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79—86.
Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL, 271-278
- Polarity detection:
- Is an IMDB movie review positive or negative?
- Data: Pang & Lee’s polarity dataset v2.0 (1000 positive and 1000 negative IMDB reviews)
Baseline Algorithm (adapted from Pang and Lee)
- Tokenization
- Feature extraction
- Classification using different classifiers
- Naïve Bayes
- MaxEnt
- SVM
Sentiment Tokenization Issues
Deal with HTML and XML markup
Twitter mark-up (names, hash tags)
Capitalization (preserve for words in all caps)
Phone numbers, dates
Emoticons
Extracting Features for Sentiment Classification
How to handle negation?
- “I didn’t like this movie” vs. “I really like this movie”
Which words to use?
- Only adjectives
- All words
- All words turns out to work better, at least on this data
Negation
Das, Sanjiv and Mike Chen. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79—86.
Add NOT_ to every word between negation and following punctuation:
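A sketch of this transformation; the negation-word list and the simple tokenizer are assumptions for illustration, not part of the original papers:

```python
import re

def mark_negation(text):
    """Prefix NOT_ to every word between a negation word and the
    next punctuation mark (after Das & Chen 2001; Pang et al. 2002)."""
    out, negating = [], False
    for tok in re.findall(r"[\w']+|[.,!?;]", text):
        if re.fullmatch(r"[.,!?;]", tok):
            negating = False          # punctuation ends the negation scope
            out.append(tok)
        elif negating:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
            if tok.lower() in {"not", "no", "never"} or tok.lower().endswith("n't"):
                negating = True       # start marking following words
    return " ".join(out)

print(mark_negation("I didn't like this movie, but I"))
# I didn't NOT_like NOT_this NOT_movie , but I
```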
Reminder: Naïve Bayes
\[C_{NB} = \underset{c_j \in C}{\operatorname{argmax}}P(c_j) \prod_{i \in positions}{P(w_i|c_j)} \]
\[\hat{P}(w|c) = \frac{count(w,c) + 1}{count(c) + |V|}\]
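The two formulas above fit in a short self-contained implementation; the two-document training set below is a toy:

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns counts for prediction."""
    labels = Counter(lbl for _, lbl in docs)          # class document counts
    word_counts = {lbl: Counter() for lbl in labels}  # count(w, c)
    for toks, lbl in docs:
        word_counts[lbl].update(toks)
    vocab = {w for c in word_counts.values() for w in c}
    return labels, word_counts, vocab

def predict(tokens, labels, word_counts, vocab):
    n_docs = sum(labels.values())
    best, best_lp = None, float("-inf")
    for lbl, n in labels.items():
        total = sum(word_counts[lbl].values())        # count(c)
        lp = math.log(n / n_docs)                     # log P(c_j)
        for w in tokens:
            # add-1 smoothing: P(w|c) = (count(w,c) + 1) / (count(c) + |V|)
            lp += math.log((word_counts[lbl][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = lbl, lp
    return best

docs = [(["great", "movie"], "pos"), (["boring", "movie"], "neg")]
model = train_nb(docs)
print(predict(["great", "movie"], *model))  # pos
```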
Cross-Validation
Break up data into 10 folds
- (Equal positive and negative inside each fold?)
For each fold
Choose the fold as a temporary test set
Train on 9 folds, compute performance on the test fold
- Report average performance of the 10 runs
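The procedure can be written generically; the round-robin split below is one simple way to form folds (it does not balance classes, per the caveat above):

```python
def cross_validate(data, train_fn, eval_fn, k=10):
    """k-fold CV: each fold serves as the test set exactly once."""
    folds = [data[i::k] for i in range(k)]   # round-robin split (unstratified)
    scores = []
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model = train_fn(train)              # train on the other k-1 folds
        scores.append(eval_fn(model, test))  # evaluate on the held-out fold
    return sum(scores) / k                   # average performance over k runs
```

Usage: `cross_validate(labeled_docs, train_nb_fn, accuracy_fn)` with any trainer and scorer of matching signatures.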
Supervised Sentiment Analysis
Negation is important
Using all words (in Naïve Bayes) works well for some tasks
- Finding subsets of words may help in other tasks
- Hand-built polarity lexicons
- Use seeds and semi-supervised learning to induce lexicons
Other challenges in SA
Explicit and implicit aspects
(Hu and Liu, 2004)
Explicit aspects: Aspects explicitly mentioned as nouns or noun phrases in a sentence
- “The picture quality of this phone is great.”
- Implicit aspects: Aspects not explicitly mentioned in a sentence but implied
- “This car is so expensive.”
- “This phone will not easily fit in a pocket.”
- “Included 16MB is stingy.”
Some work has been done (Su et al. 2009; Hai et al 2011)
Explicit Opinions
Bagheri et al. 2013
Some interesting sentences
“Trying out Chrome because Firefox keeps crashing.”
Firefox: negative; no opinion about Chrome.
We need to segment the sentence into clauses to decide that “crashing” applies only to Firefox.
- But how about these
“I changed to Audi because BMW is so expensive.”
“I did not buy a BMW because of the high price.”
“I am so happy that my iPhone is nothing like my old ugly Droid.”
Some interesting sentences (contd)
These two sentences are from paint reviews.
“For paintX, one coat can cover the wood color.”
“For paintY, we need three coats to cover the wood color.”
We know that paintX is good and paintY is not, but how can a system know this?
“My goal is to get a tv with good picture quality”
“The top of the picture was brighter than the bottom.”
“When I first got the airbed a couple of weeks ago it was wonderful as all new things are, however as the weeks progressed I liked it less and less.”
Some interesting sentences (contd)
Conditional sentences are hard to deal with (Narayanan et al. 2009)
- “If I can find a good camera, I will buy it.”
But conditional sentences can have opinions
- “If you are looking for a good phone, buy Nokia.”
Questions are also hard to handle
- “Are there any great perks for employees?”
- “Any idea how to fix this lousy Sony camera?”
Some interesting sentences (contd)
Sarcastic sentences
- “What a great car, it stopped working on the second day.”
Sarcastic sentences are common in political blogs, comments, and discussions.
- They make political opinions difficult to handle
Multiclass and Multilabel Classification
Multi-class classification
Sentiment: Positive, Negative, Neutral
Emotion: angry, sad, joyful, fearful, ashamed, proud, elated
Disease: Healthy, Cold, Flu
Weather: Sunny, Cloudy, Rain, Snow
One-vs-all (one-vs-rest)
One-vs-all
While some classification algorithms naturally permit the use of more than two classes and/or labels, others are by nature binary algorithms; these can, however, be turned into multinomial classifiers by a variety of strategies.
A common strategy is one-vs-all, which involves training a single classifier per class, with the samples of that class as positive samples and all other samples as negatives.
One-vs-all
Train a logistic regression classifier \(h_\theta^{(i)}(x)\) for each class \(i\) to predict the probability that \(y=i\)
Given a new input \(x\), pick the class \(i\) that maximizes
\[\max_i{h_\theta^{(i)}(x)}\]
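A sketch of this decision rule with hypothetical per-class scorers standing in for trained classifiers \(h_\theta^{(i)}\):

```python
# Hypothetical per-class scorers h_i(x): each estimates P(y = i | x).
# One-vs-all prediction picks the class whose classifier is most confident.
def one_vs_all_predict(x, classifiers):
    return max(classifiers, key=lambda i: classifiers[i](x))

h = {"pos": lambda x: 0.7 if "great" in x else 0.2,
     "neg": lambda x: 0.8 if "awful" in x else 0.1,
     "neu": lambda x: 0.3}

print(one_vs_all_predict({"great", "movie"}, h))  # pos
```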
Naïve Bayes
Estimate \(P(Y)\) and \(P(X|Y)\)
Prediction
\[\hat{y} = \underset{y}{\operatorname{argmax}}P(Y = y)P(X = x|Y = y)\]
Logistic regression
Estimate \(P(Y|X)\) directly
(Or a discriminant function: e.g., SVM)
Prediction
\[\hat{y} = \underset{y}{\operatorname{argmax}}P(Y = y|X = x)\]
Classification
- Multiclass classification is the task of classifying instances into one of three or more classes. Classifying instances into one of two classes is called binary classification. Multiclass classification should not be confused with multi-label classification, where multiple labels are to be predicted for each instance.
Multi-label classification
In multiclass, one-vs-all requires the base classifiers to produce a real-valued score for its decision, rather than just a class label. Then, the final label is the one corresponding to the class with the highest score.
In multilabel, this strategy predicts all labels for this sample for which the respective classifiers predict a positive result.
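A sketch of the multilabel variant; the 0.5 threshold and the constant scorers are assumptions for illustration:

```python
# Multilabel one-vs-all: keep every label whose classifier fires,
# instead of taking only the single argmax class.
def multilabel_predict(x, classifiers, threshold=0.5):
    return {lbl for lbl, h in classifiers.items() if h(x) > threshold}

emotion_clfs = {"joyful": lambda x: 0.9,   # hypothetical fixed scores
                "proud":  lambda x: 0.6,
                "sad":    lambda x: 0.1}

print(multilabel_predict("what a match!", emotion_clfs))
```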
Summary
Summary: what did we learn?
Sentiment Analysis
Multiclass and Multi-label classification